Minimum Factorization Agreement of Spliced ESTs
Abstract
Producing spliced EST sequences is a fundamental task in reconstructing splice and transcript variants, a crucial step in the investigation of alternative splicing. A single EST sequence can have several spliced EST sequences associated with it, since the original EST may align in different ways against wide genomic regions. In this paper we address a crucial issue arising from this step: given a collection C of different spliced EST sequences associated with an initial set S of EST sequences, how can we extract a subset C′ of C such that each EST sequence in S has a putative spliced EST in C′ and the members of C′ agree on a common alignment region of the genome, or gene structure? We introduce a new computational problem, called Minimum Factorization Agreement (MFA), that models this issue and is also relevant in more general settings. We investigate some algorithmic solutions of the MFA problem and their applicability to real data sets. We show that algorithms solving the MFA problem can efficiently find the correct spliced EST associated with an EST, even when the splicing of sequences is obtained by a rough alignment process. Finally, we show that the MFA method could be used to produce or analyze spliced EST libraries under various biological criteria.
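The selection problem described in the abstract can be illustrated with a small sketch. It rests on assumptions that the abstract does not state: each candidate spliced EST is represented as a set of factors (for example, exon labels), and "agreement" is taken to mean choosing the smallest set of factors such that every EST has at least one candidate built only from the chosen factors. The function name, the data layout, and the toy data are hypothetical, and the exhaustive search is only a baseline for tiny inputs, not the algorithms investigated in the paper.

```python
from itertools import combinations


def minimum_factorization_agreement(compositions):
    """Brute-force sketch of an MFA-style selection (assumed formalization).

    compositions: dict mapping an EST id to a list of candidate
    factorizations, each a frozenset of factor labels.
    Returns a smallest set of factors such that every EST has at least one
    candidate fully contained in it, or None if no such set exists.
    """
    # Collect every factor that appears in any candidate factorization.
    all_factors = sorted(
        set().union(*(c for cands in compositions.values() for c in cands))
    )
    # Try factor subsets in order of increasing size; the first hit is minimum.
    for size in range(len(all_factors) + 1):
        for chosen in combinations(all_factors, size):
            chosen_set = set(chosen)
            if all(
                any(c <= chosen_set for c in cands)
                for cands in compositions.values()
            ):
                return chosen_set
    return None


# Toy data (hypothetical): three ESTs, each with two candidate spliced forms.
compositions = {
    "est1": [frozenset({"A", "B"}), frozenset({"A", "C"})],
    "est2": [frozenset({"B", "D"}), frozenset({"A", "B"})],
    "est3": [frozenset({"B"}), frozenset({"E", "F"})],
}
result = minimum_factorization_agreement(compositions)
print(sorted(result))
```

The search is exponential in the number of distinct factors, so it is useful only for checking small examples by hand.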
Similar resources
A comparison of expressed sequence tags (ESTs) to human genomic sequences.
The Expressed Sequence Tag (EST) division of GenBank, dbEST, is a large repository of the data being generated by human genome sequencing centers. ESTs are short, single pass cDNA sequences generated from randomly selected library clones. The approximately 415 000 human ESTs represent a valuable, low priced, and easily accessible biological reagent. As many ESTs are derived from yet uncharacter...
ESTviewer: a web interface for visualizing mouse, rat, cattle, pig and chicken conserved ESTs in human genes and human alternatively spliced variants
ESTviewer is a web application for interactively visualizing human gene structures, with emphasis on mammalian and avian expressed sequence tags (ESTs) that are conserved in the human genome and alternatively spliced (AS) variants. AS variants from the UCSC, Vega and PSEP annotations are presented in this application for comparison. EST data from six species, human, mouse, rat, cattle, pig and ...
EASED: Extended Alternatively Spliced EST Database
We established a database of alternative splice forms (ASforms) for nine eukaryotic organisms. ASforms are defined by comparing high-scoring ESTs with mRNA sequences using BLAST, taking known exon-intron information (from the Ensembl database). Filtering programs compare the ends of each aligned sequence pair for deletions or insertions in the EST sequence, which indicate the existence of alter...
B?J/?(?,K) Decays within QCD Factorization Approach
We used QCD factorization for the hadronic matrix elements to show that the existing data, in particular the branching ratios BR ( ?J/?K) and BR ( ?J/??), can be accounted for by this approach. We analyzed the decay within the framework of QCD factorization. We have a complete calculation of the relevant hard-scattering kernels for twist-2 and twist-3. We calculated these decays in a special scale ...
Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus
MOTIVATION Accurate gene structure annotation is a challenging computational problem in genomics. The best results are achieved with spliced alignment of full-length cDNAs or multiple expressed sequence tags (ESTs) with sufficient overlap to cover the entire gene. For most species, cDNA and EST collections are far from comprehensive. We sought to overcome this bottleneck by exploring the possib...